Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add set_bit to BooleanBufferBuilder to allow mutating bit in index #383

Merged
merged 5 commits into from
Jun 8, 2021

Conversation

boazberman
Copy link
Contributor

Which issue does this PR close?

Relates to apache/datafusion#240 & apache/datafusion#342

Rationale for this change

BooleanBufferBuilder behaves as a bitmap. My intention is to extend its API to allow mutating bits already set, to make it's API closer to the bitvec crate and useable for my use-case in arrow-datafusion.

What changes are included in this PR?

This PR includes the implementation and some tests that currently does not pass. I'm making the PR public to get help from the community, as I'm stuck.

Are there any user-facing changes?

There is a new public method set_bit added to BooleanBufferBuilder.

@Dandandan
Copy link
Contributor

Dandandan commented May 30, 2021

I added some details on the ASF Slack. I think the change is fine, but only the tests should be fixed.

Maybe it makes sense to expose get_bit as well? This also could make the tests a bit more intuitive.

@alamb
Copy link
Contributor

alamb commented May 31, 2021

FYI @tustvold

@alamb
Copy link
Contributor

alamb commented May 31, 2021

Thanks @boazberman !

Here is the Slack link @Dandandan is referring to: https://the-asf.slack.com/archives/C01QUFS30TD/p1622403410176000

FWIW @tustvold implemented some version of this logic in https://github.com/influxdata/influxdb_iox/blob/main/arrow_util/src/bitset.rs in case you want to have a friendly look (and perhaps we can eventually use BooleanBufferBuilder rather than our own bitset creator,

@boaz-codota
Copy link
Contributor

boaz-codota commented Jun 5, 2021

Fix the tests, could you review? @Dandandan @alamb ?

Also, @alamb I noticed that in https://github.com/influxdata/influxdb_iox/blob/main/arrow_util/src/bitset.rs#L87 is differ from what is done here:

// influx_iox
data[i >> 3] |= 1 << (i & 7) 

vs

// arrow-rs
const BIT_MASK: [u8; 8] = [1, 2, 4, 8, 16, 32, 64, 128];
// ...
data[i >> 3] |= BIT_MASK[i & 7];

which I haven't tested but might have performance implications (?)

@@ -302,6 +302,15 @@ impl BooleanBufferBuilder {
self.len
}

#[inline]
pub fn set_bit(&mut self, index: usize, bit: bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn set_bit(&mut self, index: usize, bit: bool) {
pub fn set_bit(&mut self, index: usize, v: bool) {

(as called in other methods)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

@Dandandan Dandandan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@codecov-commenter
Copy link

Codecov Report

Merging #383 (d3b89d1) into master (0a16574) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #383      +/-   ##
==========================================
+ Coverage   82.62%   82.65%   +0.02%     
==========================================
  Files         162      162              
  Lines       44479    44542      +63     
==========================================
+ Hits        36751    36814      +63     
  Misses       7728     7728              
Impacted Files Coverage Δ
arrow/src/util/bit_util.rs 100.00% <ø> (ø)
arrow/src/array/builder.rs 85.90% <100.00%> (+0.61%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2ddc717...d3b89d1. Read the comment docs.

@alamb
Copy link
Contributor

alamb commented Jun 6, 2021

which I haven't tested but might have performance implications (?)

Hopefully the compiler can figure it out -- I think the way you have it is fine until we get some sort of profiling / performance measurements that show it isn't. 👍

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @boazberman -- thank you!

Would it be possible to fix the rust clippy CI check? Then I think this is ready to merge in.

🥇 for tests 👍

@boaz-codota
Copy link
Contributor

@alamb fixed clippy failures

@boaz-codota
Copy link
Contributor

hey, what is blocking this from being merged? @Dandandan @alamb

@Dandandan Dandandan merged commit 18c804a into apache:master Jun 8, 2021
@Dandandan
Copy link
Contributor

hey, what is blocking this from being merged? @Dandandan @alamb

Merged! Thank you 🚀

@alamb
Copy link
Contributor

alamb commented Jun 8, 2021

hey, what is blocking this from being merged? @Dandandan @alamb

Lack of time :) Sorry about that @boaz-codota -- thanks for following up and thanks for merging @Dandandan 👍

alamb pushed a commit that referenced this pull request Jun 8, 2021
)

* Add set_bit to BooleanBufferBuilder to allow mutating bits in the builder

* Fix tests

* Update builder.rs

* Update builder.rs

* Fix clippy failures

Co-authored-by: Boaz Berman <[email protected]>
alamb added a commit that referenced this pull request Jun 9, 2021
) (#431)

* Add set_bit to BooleanBufferBuilder to allow mutating bits in the builder

* Fix tests

* Update builder.rs

* Update builder.rs

* Fix clippy failures

Co-authored-by: Boaz Berman <[email protected]>

Co-authored-by: Boaz <[email protected]>
Co-authored-by: Boaz Berman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants